72 research outputs found
Bankruptcy Prediction of Small and Medium Enterprises Using a Flexible Binary Generalized Extreme Value Model
We introduce a binary regression accounting-based model for bankruptcy
prediction of small and medium enterprises (SMEs). The main advantage of the
model lies in its predictive performance in identifying defaulted SMEs. Another
advantage, which is especially relevant for banks, is that the relationship
between the accounting characteristics of SMEs and response is not assumed a
priori (e.g., linear, quadratic or cubic) and can be determined from the data.
The proposed approach uses the quantile function of the generalized extreme
value distribution as link function as well as smooth functions of accounting
characteristics to flexibly model covariate effects. Therefore, the usual
assumptions in scoring models of symmetric link function and linear or
pre-specified covariate-response relationships are relaxed. Out-of-sample and
out-of-time validation on Italian data shows that our proposal outperforms the
commonly used (logistic) scoring model for different default horizons.
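
As a rough illustration of the asymmetric link mentioned in the abstract above, the sketch below uses the standard generalized extreme value (GEV) distribution function as the inverse link and its quantile function as the link. The function names, the shape parameter name `tau`, and the handling of points outside the support are assumptions made for this example; it is not the authors' implementation and omits the smooth covariate terms.

```python
import math


def gev_inverse_link(eta: float, tau: float) -> float:
    """Map a predictor value eta to a probability using the GEV CDF.

    tau is the GEV shape parameter (hypothetical name for this sketch).
    """
    if abs(tau) < 1e-12:
        # Gumbel limit of the GEV distribution as tau -> 0.
        return math.exp(-math.exp(-eta))
    z = 1.0 + tau * eta
    if z <= 0.0:
        # Outside the support: the CDF is 0 below the lower bound (tau > 0)
        # and 1 above the upper bound (tau < 0).
        return 0.0 if tau > 0 else 1.0
    return math.exp(-(z ** (-1.0 / tau)))


def gev_link(p: float, tau: float) -> float:
    """GEV quantile function: the link that maps a probability to eta."""
    if not 0.0 < p < 1.0:
        raise ValueError("p must lie strictly between 0 and 1")
    if abs(tau) < 1e-12:
        return -math.log(-math.log(p))
    return ((-math.log(p)) ** (-tau) - 1.0) / tau


# Unlike the logistic link, eta = 0 does not correspond to p = 0.5.
p_at_zero = gev_inverse_link(0.0, 0.2)
eta_back = gev_link(gev_inverse_link(0.3, -0.25), -0.25)
```

The asymmetry is visible at `eta = 0`, where the logistic model gives a probability of 0.5 while this inverse link gives `exp(-1)` for any nonzero `tau`.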
Some Problems in Model Specification and Inference for Generalized Additive Models
Regression models describing the dependence between a univariate response and a set of covariates play a fundamental role in statistics. In the last two decades, a tremendous effort has been made in developing flexible regression techniques such as generalized additive models (GAMs) with the aim of modelling the expected value of a response variable as a sum of smooth unspecified functions of predictors. Many nonparametric regression methodologies exist, including local-weighted regression and smoothing splines. Here the focus is on penalized regression spline methods, which can be viewed as a generalization of smoothing splines with a more flexible choice of bases and penalties. This thesis addresses three issues. First, the problem of model misspecification is treated by extending the instrumental variable approach to the GAM context. Second, we study the theoretical and empirical properties of the confidence intervals for the smooth component functions of a GAM. Third, we consider the problem of variable selection within this flexible class of models. All results are supported by theoretical arguments and extensive simulation experiments which shed light on the practical performance of the methods discussed in this thesis.
EThOS - Electronic Theses Online Service, GB, United Kingdom
A Unifying Switching Regime Regression Framework with Applications in Health Economics
Motivated by three health economics-related case studies, we propose a unifying and
flexible modelling framework in the context of the utility-based Roy model of switching
regimes. The proposal can handle the peculiar distributional shapes of the considered
outcomes via a vast range of marginal distributions, allows for a wide variety of copula
dependence structures and permits specifying all model parameters (including the
dependence parameters) as flexible functions of covariate effects. The algorithm is based on a
computationally efficient and stable penalised maximum likelihood estimation approach
with integrated automatic multiple smoothing parameter selection. Inferential results are
also readily available. The proposed modelling framework is evaluated using simulated
data and employed for three applications in health economics that use data from the
Medical Expenditure Panel Survey, where novel patterns are uncovered. The new
framework has been incorporated in the R package GJRM, hence allowing any user to fit the
desired model(s) and produce easy-to-interpret numerical and visual summaries.
Single and multiple-group penalized factor analysis: a trust-region algorithm approach with integrated automatic multiple tuning parameter selection
Penalized factor analysis is an efficient technique that produces a factor loading matrix with many zero elements thanks to the introduction of sparsity-inducing penalties within the estimation process. However, sparse solutions and stable model selection procedures are only possible if the employed penalty is non-differentiable, which poses certain theoretical and computational challenges. This article proposes a general penalized likelihood-based estimation approach for single and multiple-group factor analysis models. The framework builds upon differentiable approximations of non-differentiable penalties, a theoretically founded definition of degrees of freedom, and an algorithm with integrated automatic multiple tuning parameter selection that exploits second-order analytical derivative information. The proposed approach is evaluated in two simulation studies and illustrated using a real data set. All the necessary routines are integrated into the R package penfa.
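
To make the idea of a differentiable approximation of a non-differentiable penalty concrete, the sketch below smooths the absolute value used by a lasso-type penalty with the common surrogate sqrt(theta^2 + c). The surrogate, the constant `c`, and all function names are assumptions chosen for illustration; the abstract does not state which approximation the article or the penfa package uses.

```python
import math


def smooth_abs(theta: float, c: float = 1e-8) -> float:
    """Differentiable surrogate for |theta|; c > 0 controls the smoothing."""
    return math.sqrt(theta * theta + c)


def smooth_abs_grad(theta: float, c: float = 1e-8) -> float:
    """First derivative of the surrogate, defined everywhere including 0."""
    return theta / math.sqrt(theta * theta + c)


def smooth_abs_hess(theta: float, c: float = 1e-8) -> float:
    """Second derivative of the surrogate, usable by second-order optimizers."""
    return c / (theta * theta + c) ** 1.5


def smooth_lasso_penalty(loadings: list[float], lam: float, c: float = 1e-8) -> float:
    """Lasso-type penalty lam * sum(|theta|) with |.| replaced by the surrogate."""
    return lam * sum(smooth_abs(t, c) for t in loadings)


# Hypothetical loadings, used only to exercise the functions.
penalty_value = smooth_lasso_penalty([1.0, -1.0, 0.0], lam=0.5)
```

Because the surrogate has analytical first and second derivatives at zero, it can be combined with optimizers that rely on second-order information, which is the kind of setting the abstract describes.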
A spline-based framework for the flexible modelling of continuously observed multistate survival processes
Multistate modelling is becoming increasingly popular due to the availability of richer longitudinal health data. When the times at which the events characterising disease progression are known, the modelling of the multistate process is greatly simplified as it can be broken down into a number of traditional survival models. We propose to flexibly model them through the existing general link-based additive framework implemented in the R package GJRM. The associated transition probabilities can then be obtained through a simulation-based approach implemented in the R package mstate, which is appealing due to its generality. The integration between the two is seamless and efficient since we model a transformation of the survival function, rather than the hazard function, as is commonly found. This is achieved through the use of shape-constrained P-splines which elegantly embed the monotonicity required for the survival functions within the construction of the survival functions themselves. The proposed framework allows for the inclusion of virtually any type of covariate effects, including time-dependent ones, while imposing no restriction on the multistate process assumed. We exemplify the usage of this framework through a case study on breast cancer patients.
Beyond unidimensional poverty analysis using distributional copula models for mixed ordered-continuous outcomes
Poverty is a multidimensional concept often comprising a monetary outcome and
other welfare dimensions such as education, subjective well-being or health,
that are measured on an ordinal scale. In applied research, multidimensional
poverty is ubiquitously assessed by studying each poverty dimension
independently in univariate regression models or by combining several poverty
dimensions into a scalar index. This inhibits a thorough analysis of the
potentially varying interdependence between the poverty dimensions. We propose
a multivariate copula generalized additive model for location, scale and shape
(copula GAMLSS or distributional copula model) to tackle this challenge. By
relating the copula parameter to covariates, we specifically examine if certain
factors determine the dependence between poverty dimensions. Furthermore,
specifying the full conditional bivariate distribution allows us to derive
several features such as poverty risks and dependence measures coherently from
one model for different individuals. We demonstrate the approach by studying
two important poverty dimensions: income and education. Since the level of
education is measured on an ordinal scale while income is continuous, we extend
the bivariate copula GAMLSS to the case of mixed ordered-continuous outcomes.
The new model is integrated into the GJRM package in R and applied to data from
Indonesia. Particular emphasis is given to the spatial variation of the
income-education dependence and groups of individuals at risk of being
simultaneously poor in both education and income dimensions.